The purpose of this specification is to present the justification for
the IBOX/EBOX multi-processor design and to describe their functional
relationship to each other and memory. The Dolphin CPU will consist of
two microcoded machines that will exist in a tightly coupled
multi-processor environment. Each processor will have independent
microcode and execution. They will not, however, be capable of executing
the complete PDP-10 instruction set without each other. The combination
of these two processors will provide the complete instruction execution
capability of the Dolphin CPU.



1.1 Description 

The IBOX/EBOX tasks will be separated as follows: 

1. The IBOX will handle all instruction fetches and effective 
address calculations. 

2. The EBOX will do all remaining computation and result storing. 

3. The IBOX will write no results. 

The IBOX data path will consist of a 36-bit limited-function ALU and the
minimum number of registers needed to calculate effective addresses. In
addition to these capabilities, the IBOX will be capable of completing
any instruction that only affects the PC (skips, jumps, tests, etc.).



1.2 Rationale 

Justification for using two micro machines came from various
benchmarks designed to study instruction frequencies and instruction pair
occurrences. Data from these tests is available for study if desired.
The most significant data point showed that approximately one-third of
all instructions executed stored no results, in either ACs or memory.
The only thing affected by these instructions was the PC; therefore, a
separate processor could handle this class of instruction.

The next major factor in deciding to use two processors was a program
written by Mike Newman designed to study the distribution of conflicts
between the results of an instruction and the effective address
computation of the next instruction. This program showed that less than
15% of all PC, index register, and indirect word fetches conflicted with
the previous instruction's results.

Given the preceding information, we then constructed and ran under
simulation an IBOX/EBOX configuration. A test was performed on a
MOVE/ADD sequence with a built-in conflict. The results showed 19 clock
ticks vs. 27 clock ticks on the KS10 (similar data path). This very
limited test showed that we could indeed obtain substantial overlap.
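
As a rough reading of the two tick counts quoted above, the following
sketch works out the ratio and the fraction of ticks saved. Only the
numbers 19 and 27 come from the test; the variable names are
illustrative, and the result says nothing about workloads other than
this single MOVE/ADD sequence.

```python
# Clock-tick counts quoted in the text for the MOVE/ADD test sequence.
dolphin_ticks = 19   # simulated IBOX/EBOX configuration
ks10_ticks = 27      # KS10 (similar data path)

speedup = ks10_ticks / dolphin_ticks          # ratio of tick counts
ticks_saved = ks10_ticks - dolphin_ticks      # absolute saving
fraction_saved = ticks_saved / ks10_ticks     # saving relative to the KS10

print(f"speedup: {speedup:.2f}x")
print(f"ticks saved: {ticks_saved} ({fraction_saved:.0%} of the KS10 count)")
```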

The fact that the Dolphin CPU will be built from 1.5ns gates made it
necessary to use this type of arrangement in order to meet the
goal of [illegible] times the KL10 speed.



1.3 IBOX Data Path

The [illegible] in the IBOX data path are necessary in order to
calculate effective addresses. In addition to this, the ability
to do the functions AND and SUBTRACT in the ALU was necessary for Test
and Compare instructions.

The AC address bus is used both for addressing the Ram file and for
microcode branching. Any of 4 items can be selected onto this bus. The
indirect word register (IW) has two 5-bit fields that are used for the
index register selection and the indirect bit in each of the Instruction
Format and the Extended Format Indirect Word. The low-order bits of the
[illegible] can be selected onto this bus for conflict compares and
for fetching the effective address from the ACs. The AC field of the
[illegible] is also available on this bus for conflicts and AC
operand fetching.
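
The two 5-bit fields can be illustrated with a short sketch. It assumes
the standard PDP-10 bit layout (bit 0 is the most significant of 36):
in the Instruction Format the indirect bit is bit 13 and the index
register field is bits 14-17, and in the Extended Format Indirect Word
the indirect bit is bit 1 and the index field is bits 2-5. Whether the
IW register taps exactly these bits is an assumption, and the function
names and example words are hypothetical.

```python
MASK_5 = 0o37  # five bits: indirect bit followed by 4-bit index register number

def instruction_format_fields(word):
    """Return (indirect, index) from bits 13-17 of a 36-bit instruction word."""
    field = (word >> 18) & MASK_5      # bit 17 sits 18 places above bit 35
    return field >> 4, field & 0o17

def extended_indirect_fields(word):
    """Return (indirect, index) from bits 1-5 of an Extended Format Indirect Word."""
    field = (word >> 30) & MASK_5      # bit 5 sits 30 places above bit 35
    return field >> 4, field & 0o17

# Hypothetical example words, built by hand for illustration only.
move_word = (0o200 << 27) | (1 << 23) | (1 << 22) | (2 << 18) | 0o100
efiw_word = (1 << 34) | (3 << 30) | 0o1234

print(instruction_format_fields(move_word))   # (1, 2)
print(extended_indirect_fields(efiw_word))    # (1, 3)
```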



1.4 IBOX/EBOX Interface

Three "last" registers are used to communicate
results to the EBOX. They are as follows:

1. Last IR/Last AC - This register contains the opcode and AC of
the instruction that has been passed to the EBOX for further
processing.

2. Last E - This register contains the calculated Effective
Address (EA) of the instruction that has been passed to the
EBOX for further processing.

3. Last PC - This register contains the Program Counter (PC) of 
the instruction that has been passed to the EBOX. 

These registers have four 6-bit comparators connected to them in
order to provide the conflict logic for operand fetching. They compare
the contents of the AC address bus with Last AC, Last AC+1, E and E+1
simultaneously. The enables for the comparators come from the Dispatch
Ram and are based on what the instruction is expected to write as
results. The results-written classes are:

1. None - Instruction writes no results (no conflict). 

2. Other - Instruction writes in unspecified locations (guaranteed
conflict).

3. AC - Instruction stores in AC.

4. AC and AC+1 - Instruction stores in AC and AC+1.

5. E - Instruction stores in Effective Address.

6. E and E+1 - Instruction stores in EA and EA+1.

7. Both - Instruction stores in AC and EA.

8. Both and AC+1 - Instruction stores in EA, AC and AC+1.
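
The comparator enables implied by the eight classes above can be
sketched as a small table. This is a simplified model, not the actual
logic: it compares whole addresses rather than 6-bit values, it assumes
AC+1 wraps from 17 (octal) to 0 as on other PDP-10 processors, and the
names `Writes` and `conflicts` are hypothetical.

```python
from enum import Enum

class Writes(Enum):
    # Enables for the four comparators: (Last AC, Last AC+1, E, E+1)
    NONE = (False, False, False, False)
    OTHER = None                      # unspecified locations: always a conflict
    AC = (True, False, False, False)
    AC_AND_AC1 = (True, True, False, False)
    E = (False, False, True, False)
    E_AND_E1 = (False, False, True, True)
    BOTH = (True, False, True, False)
    BOTH_AND_AC1 = (True, True, True, False)

def conflicts(bus_addr, writes, last_ac, last_e):
    """True if the address on the AC address bus hits a pending result."""
    if writes is Writes.OTHER:
        return True                   # guaranteed conflict
    en_ac, en_ac1, en_e, en_e1 = writes.value
    targets = [
        (en_ac, last_ac),
        (en_ac1, (last_ac + 1) % 16), # assumed wrap-around of AC+1
        (en_e, last_e),
        (en_e1, last_e + 1),
    ]
    # All enabled comparators are checked together, as in the hardware.
    return any(enabled and bus_addr == addr for enabled, addr in targets)
```

For example, with Last AC = 2 and class "AC and AC+1", a fetch of AC 3
reports a conflict, while the same fetch under class "AC" does not.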

These registers are loaded on a "load last" signal from the IBOX
microcode. An interlock mechanism on the "last" registers will prevent
the IBOX from continuing until the EBOX clears this interlock, but only
when the IBOX attempts to load new data in these registers. At the end
of the load cycle, the EBOX will be able to dispatch on this condition
(instruction available).
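
The interlock behaves like a one-entry handoff between the two boxes.
The sketch below models that behavior in software under stated
assumptions: the class and method names are hypothetical, and it assumes
the EBOX clears the interlock at the moment it takes the instruction,
which the text does not spell out.

```python
class LastRegisters:
    """One-entry handoff modelling the "last" registers and their interlock."""

    def __init__(self):
        self.interlock = False   # set while the EBOX has not yet taken the data
        self.contents = None

    def ibox_load(self, last_ir_ac, last_e, last_pc):
        """'Load last': return True if loaded, False if the IBOX must stall."""
        if self.interlock:
            return False         # IBOX is held only when it tries to load
        self.contents = (last_ir_ac, last_e, last_pc)
        self.interlock = True    # EBOX now sees "instruction available"
        return True

    def ebox_take(self):
        """EBOX dispatch: take the instruction and clear the interlock."""
        if not self.interlock:
            return None          # nothing available yet
        taken = self.contents
        self.interlock = False
        return taken
```

A second `ibox_load` before `ebox_take` returns False, which corresponds
to the IBOX waiting; the IBOX is otherwise free to run ahead.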

The EBOX will have the capability to grab and start the IBOX at any
arbitrary microcode location. In addition, it can pass any single data
item (36 bits) in location 377 (octal) of the Ram File. The IBOX will
have a special function to specifically read this datum.

An additional interlock mechanism between the IBOX and the EBOX will
exist for the IBOX to be able to pass data (36 bits at a time) to the
EBOX via the Instruction Register (IR). The implementation of the
interlock will be similar to the "last" register mechanism.

The IBOX will also have the capability of ignoring "conflict" in order
to speed up certain classes of instructions for which the IBOX can fetch
and decode operands that it knows will not be written by the EBOX.



1.5 EBOX Data Path 

The EBOX data path consists of a 16 word, two port register file, an 
adder, a shift matrix, a 256 word accumulator RAM, a byte pointer 
manipulator, and various multiplexors connecting these parts. The 
attached diagram shows the interconnections. 

The data path is 36 bits wide. With the exception of floating point,
all the instructions in the PDP-10 instruction set can be easily
implemented using a 36-bit wide data path. The KS10 uses a 36-bit wide
path and has trouble only with floating point arithmetic, particularly
double precision. The 72-bit paths in the KL10 were added to make the
floating point arithmetic fast. Since fast floating point is a goal
only of the Dolphin floating point accelerator and not of the basic
machine, the narrow data path will reduce costs without compromising
system goals.






There will be no "10-bit data path" for the manipulation of floating
point exponents. On the KL10, this logic was needed to make floating
point fast. The floating point accelerator makes this logic
unnecessary in the basic machine. For the manipulation of byte
pointers [illegible], the shift matrix replaces the KL10's 10-bit path.

The [illegible] field of the microcode word will be 18 bits wide.
[illegible] the KS10 18-bit field proved [illegible] for
implementing functions for infrequently used hardware.

[illegible] BR, and BRX [illegible] regularly connected.
The register file can be written by halves [illegible] can
be written with zeros. This makes the common half word instructions
trivial.

The KL10's VMA and MQ registers are combined into one register.
With the register file, the only time the MQ is needed is during
the multiply and divide algorithm steps. At these times the VMA is not
active, so the functions can be combined, giving a cost savings with
no performance loss.

[illegible] data input [illegible] arithmetic functions [illegible]
available. All of the boolean functions are available.

[illegible] the shifting paths necessary for [illegible] multiply and
the times-ten path take no [illegible].

[illegible] more parity checking. Instruction retry looks [illegible]
can determine that there [illegible] will examine this further.



1.6 Interface To MBOX (see Don Lewine Memo of 21-June-78)

1.7 IBOX Data Path Diagram, Attached.

1.8 EBOX Data Path Diagram, Attached. 

1.9 MOVE/ADD/JRST Timing Diagram, Attached.

1.10 Fortran Accelerator Data Connections Diagram, Attached.